Gradient descent tries to find the lowest point in a fitness landscape by following the direction with the fastest drop in value. It is equivalent to gradient ascent with the fitness function inverted. Simulated annealing is a form of probabilistic gradient descent in which the gradient determines the probability of moving in a given direction, rather than always following the steepest route downwards. Backpropagation can also be viewed as a form of gradient descent that minimises the difference between the actual and expected outputs of a neural network, but which is performed incrementally for each training example in turn.
Defined on page 184
Used on Chap. 7: page 142; Chap. 9: pages 184, 185, 187; Chap. 12: page 280
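The two ideas above can be sketched in a few lines. This is a minimal illustrative example, not taken from the book: `gradient_descent` always steps down the steepest slope, while `simulated_annealing` (shown here with a standard Metropolis acceptance rule and exponential cooling, both common but assumed choices) sometimes accepts uphill moves with a probability that shrinks as the temperature falls.

```python
import math
import random

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Always follow the direction of steepest descent."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # step against the gradient
    return x

def simulated_annealing(f, x0, temp=1.0, cooling=0.95,
                        steps=200, step_size=0.5, seed=0):
    """Probabilistic descent: downhill moves are always taken;
    uphill moves are accepted with probability exp(-delta/temp),
    which falls as the temperature cools."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for _ in range(steps):
        cand = x + rng.uniform(-step_size, step_size)  # random neighbour
        delta = f(cand) - fx
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, f(cand)  # accept the move
            if fx < fbest:
                best, fbest = x, fx
        temp *= cooling  # cool down, making uphill moves rarer
    return best

# f(x) = (x - 3)^2 has its minimum at x = 3; its gradient is 2(x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
annealed = simulated_annealing(lambda x: (x - 3) ** 2, x0=0.0)
```

Here both searches home in on the minimum at x = 3; on a landscape with many local minima, the annealer's occasional uphill moves are what let it escape a poor basin that pure gradient descent would be stuck in.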